Geothermal machine learning analysis: Southwest New Mexico

This notebook is part of GeoThermalCloud.jl, a machine learning framework for geothermal exploration.

Machine learning analyses are performed using the SmartTensors machine learning framework.

This notebook demonstrates how the NMFk module of SmartTensors can be applied to perform unsupervised geothermal machine-learning analyses.

How the ML results are interpreted to provide geothermal insights is discussed in more detail in our research paper.

Introduction

Southwest New Mexico (SWNM) is a region important for geothermal exploration.

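SWNM is broadly divided into four physiographic provinces: the Colorado Plateau, the Mogollon-Datil Volcanic Field (MDVF), the Basin and Range, and the Rio Grande Rift.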

Each of the SWNM physiographic provinces is associated with different types of unique hydrothermal systems with temperatures ranging from low (<90℃) to medium (90-150℃).

Some of the SWNM systems are already utilized for commercial and recreational purposes.

There are already energy-production facilities for both electricity and direct-use heating.

For example, the Basin and Range province has one geothermal power plant (Lightning Dock) with a gross capacity of ~14 MWe, five greenhouse farms, and numerous medium-temperature wells and springs. There are 14 spas and recreational facilities utilizing the SWNM geothermal resources.

A recent Play Fairway Analysis (PFA) Phase I study of SWNM conducted at LANL revealed additional potential geothermal resources.

The study area and the data collection locations are mapped below.

[Figure: SWNM study area]

Import required libraries for this work

If NMFk is not installed, first execute the following in the Julia REPL: import Pkg; Pkg.add("NMFk"); Pkg.add("DelimitedFiles"); Pkg.add("JLD"); Pkg.add("Gadfly"); Pkg.add("Cairo"); Pkg.add("Fontconfig"); Pkg.add("Mads")

Load and pre-process the data

Set up the working directory containing the SWNM data

Load the SWNM data file

Define names of the data attributes (matrix columns)

Short attribute names are used for coding.

Long attribute names are used for plotting and visualization.

Define attributes to remove from analysis

Define attributes for analysis

Define names of the data locations

Short location names are used for coding.

Long location names are used for plotting and visualization.

Define location coordinates

Set up directories to store obtained results and figures

Define a range for the number of signatures to be explored

Define and normalize the data matrix
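As a rough illustration of what normalizing the data matrix can look like, the sketch below rescales each attribute (matrix column) to the range [0, 1]. It assumes plain min-max scaling and uses only base Julia. The function name normalize_columns and the small matrix are hypothetical, and NMFk's own normalization routine may differ in detail.

```julia
# Column-wise min-max scaling: each attribute (column) is mapped to [0, 1].
# Assumption: this illustrates the idea of the normalization step only;
# it is not NMFk's own routine.
function normalize_columns(X::AbstractMatrix{<:Real})
    Xn = float.(X)                  # floating-point copy of the input
    cmin = minimum(Xn; dims=1)      # 1 x ncols row of column minima
    cmax = maximum(Xn; dims=1)      # 1 x ncols row of column maxima
    span = cmax .- cmin
    span[span .== 0] .= 1           # constant columns: avoid division by zero
    return (Xn .- cmin) ./ span, cmin, cmax
end

# Hypothetical 3 locations x 2 attributes matrix (illustrative values only)
X = [10.0 200.0; 20.0 400.0; 30.0 300.0]
Xu, cmin, cmax = normalize_columns(X)
```

Returning the column minima and maxima alongside the scaled matrix makes it possible to map results back to physical units later.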

Perform ML analyses

The NMFk algorithm factorizes the normalized data matrix Xu into W and H matrices. For more information, check out the NMFk website.
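To make the factorization concrete, here is a minimal sketch of a single nonnegative matrix factorization using the classic multiplicative update rules of Lee and Seung. This is not the NMFk implementation: NMFk builds on many such factorizations with random restarts and clusters the solutions to estimate the number of signatures. The function name nmf_multiplicative, its parameters, and the synthetic matrix are assumptions made for illustration.

```julia
import Random
import LinearAlgebra

# Basic NMF via multiplicative updates, reducing the Frobenius norm of X - W*H.
# X: locations x attributes (nonnegative); W: locations x k; H: k x attributes.
function nmf_multiplicative(X::AbstractMatrix{<:Real}, k::Integer;
                            iterations::Integer=500, seed::Integer=1)
    rng = Random.MersenneTwister(seed)
    n, m = size(X)
    W = rand(rng, n, k) .+ 0.1      # strictly positive random start
    H = rand(rng, k, m) .+ 0.1
    tiny = 1e-12                    # guards against division by zero
    for _ in 1:iterations
        H .*= (W' * X) ./ (W' * W * H .+ tiny)
        W .*= (X * H') ./ (W * H * H' .+ tiny)
    end
    return W, H, LinearAlgebra.norm(X - W * H)
end

# Synthetic rank-2 nonnegative matrix (hypothetical data, not the SWNM dataset)
X = rand(Random.MersenneTwister(0), 6, 2) * rand(Random.MersenneTwister(1), 2, 4)
W, H, err = nmf_multiplicative(X, 2)
```

The returned residual norm plays the role of the solution quality (fit) for a given number of signatures k.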

Here, the NMFk results are loaded from a prior ML run.

As seen from the output above, the NMFk analyses identified that the optimal number of geothermal signatures in the dataset is 5. This estimate is based on the criterion that the Silhouette width (robustness) of acceptable solutions is >0.5. If the criterion is relaxed to Silhouette >0.25, the optimal number of signatures is 8.

It is important to note that our ML methodology can be applied to perform both classification and regression analyses.

For the case of regression (predictive) analyses, the optimal number of signatures is 5.

Solutions with a number of signatures less than 5 are underfitting.

Solutions with a number of signatures greater than 5 are overfitting and unacceptable.

The solution for k=8 is also analyzed below because it provides further refinements in the extracted geothermal signatures. It also provides further demonstration of the classification capabilities of our ML methodology.

The set of acceptable solutions is defined by the NMFk algorithm as follows:

The acceptable solutions contain 2, 3, 4, and 5 signatures.
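The selection rule described above can be sketched in a few lines of base Julia. The silhouette values below are hypothetical and chosen only so that the outcome matches the text (they are not the values produced by this analysis). The helper acceptable_ks is an illustrative name, and taking the largest acceptable k as the optimum is an assumption about how the criterion is applied.

```julia
# Hypothetical silhouette (robustness) values for k = 2..10;
# NOT the values obtained in this analysis.
nkrange = 2:10
robustness = [0.98, 0.91, 0.80, 0.62, 0.10, 0.20, 0.30, -0.10, 0.05]

# Keep every k whose silhouette width exceeds the cutoff.
acceptable_ks(ks, silhouette; cutoff=0.5) =
    [k for (k, s) in zip(ks, silhouette) if s > cutoff]

ks_strict = acceptable_ks(nkrange, robustness)               # cutoff 0.5
ks_loose  = acceptable_ks(nkrange, robustness; cutoff=0.25)  # cutoff 0.25

# Assumption: the optimal k is the largest acceptable one.
kopt_strict = maximum(ks_strict)
kopt_loose  = maximum(ks_loose)
```

With these illustrative values, the strict cutoff keeps k = 2, 3, 4, and 5, while the relaxed cutoff additionally admits k = 8.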

Post-process NMFk results

Number of signatures

Below is a plot representing solution quality (fit) and silhouette width (robustness) for different numbers of signatures k:

The plot above also demonstrates that the acceptable solutions contain 2, 3, 4, and 5 signatures.

Analysis of all the acceptable solutions

The ML solutions containing an acceptable number of signatures are further analyzed as follows:

Analysis of the 5-signature solution

The results for a solution with 5 signatures presented above will be further discussed here.

The geothermal attributes are clustered into 5 groups:

This grouping is based on analyses of the attribute matrix H:

[Figure: attribute matrix of the 5-signature solution, labeled and sorted]

The well locations are also clustered into 5 groups:

This grouping is based on analyses of the location matrix W:

[Figure: location matrix of the 5-signature solution, labeled and sorted]

The map ../figures-case01/locations-5-map.html provides interactive visualization of the extracted location groups (the html file can also be opened within any browser).
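One simple way to picture how locations fall into groups is to assign each location to the signature with the largest weight in its row of the location matrix. The sketch below does exactly that, assuming rows of W correspond to locations and columns to signatures. It is a simplified stand-in, not the clustering procedure NMFk uses, and the function name dominant_signature and the small matrix are hypothetical.

```julia
# Assign each location (row of W) to the signature with the largest weight.
# Assumption: a simplified stand-in for the clustering performed by NMFk.
function dominant_signature(W::AbstractMatrix{<:Real})
    return [argmax(view(W, i, :)) for i in 1:size(W, 1)]
end

# Hypothetical 4 locations x 3 signatures matrix (illustrative values only)
W = [0.9 0.1 0.0;
     0.2 0.7 0.1;
     0.1 0.2 0.8;
     0.6 0.3 0.1]

labels = dominant_signature(W)   # one signature index per location
groups = Dict(s => findall(==(s), labels) for s in 1:size(W, 2))
```

The same idea applies to the attributes by scanning the columns of the attribute matrix instead of the rows of the location matrix.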

Comparison of the ML solutions against the SWNM physiographic provinces

The spatial association of the extracted signatures with the four physiographic provinces in SWNM is summarized in the figure below:

[Figure: spatial association of the extracted signatures with the SWNM physiographic provinces]

The solutions for k=2, 3, and 4 provide a higher-level classification of the geothermal locations, while the k=8 solution allows us to further refine the geothermal signatures and their association with the physiographic provinces. The solution for k=5 provides the best classification of the geothermal locations.

Based on the figure above, it is clear that our ML algorithm was able to blindly identify the physiographic provinces associated with analyzed hydrogeothermal systems without providing any information about their spatial location (coordinates).

Further observations based on the figure above are:

Description of location matrices (W)

The plot below shows the location matrices (W) of the extracted signatures for all the accepted solutions together. The number of signatures increases from left to right. The matrices are color-coded to show high (red) and low (green) associations between the locations and signatures. Like the maps above, this figure demonstrates how the signatures are transformed and modified as the number of signatures increases. The transitions of the signatures show the consistency of the obtained results.

[Figure: location matrices (W) for all accepted solutions]

Further observations based on the figure above are (note that these observations are consistent with the observations provided above regarding the physiographic provinces):

Description of attribute matrices (H)

The plot below shows the attribute matrices (H) of all the accepted solutions. The number of signatures increases from left to right. The figure demonstrates how each attribute contributes to the extracted signatures. The matrices are color-coded to show high (red) and low (green) associations between the attributes and signatures. This plot also shows how the signatures are transformed and modified as the number of signatures increases. As above, the transitions of the signatures show the consistency of the obtained ML results.

[Figure: attribute matrices (H) for all accepted solutions]

Further observations based on the figure above are:

Optimal geothermal signatures characterizing the SWNM region

The figure below shows the map of the optimal signatures. The k=5 solution best characterizes the spatial associations and geothermal attributes of the SWNM.

[Figure: map of the optimal (k=5) signatures]

Signatures and their relationships to resource types, geothermal attributes, physical processes and physiographic provinces

| Signature | Resource type | Dominant attributes | Physical significance | Physiographic province |
|---|---|---|---|---|
| A | Low temperature | Gravity anomaly; Magnetic intensity; Volcanic dike density; Drainage density; Li+ concentration | Shallow heat flow | Southern MDVF |
| B | Medium temperature | B+ and Li+ concentrations; Gravity anomaly; Magnetic intensity; Quaternary fault density; Silica geothermometer; Heat flow; Depth to the basement | Deep heat flow | Southern Rio Grande Rift |
| C | Low temperature | B+ and Li+ concentrations; Magnetic intensity; Drainage density; Crustal thickness | Deep heat source | Colorado Plateau |
| D | Low temperature | Drainage density; Fault intersection density; Seismicity; State map fault density; Spring density; Hydraulic gradient | Tectonics | Northern Rio Grande Rift |
| E | Medium temperature | Drainage density; State map fault density; Precipitation; Silica geothermometer; Hydraulic gradient | Vertical hydraulics | Northern MDVF |

Geothermal resource assessment

Medium-temperature hydrothermal systems
Low-temperature hydrothermal systems

For more details, please see our paper, "Discovering Hidden Geothermal Signatures using Unsupervised Machine Learning."